Bilingual Dictionary Approach for Malay-English Cross-Language Information Retrieval
Abstract
Cross-language information retrieval (CLIR) is the process of submitting a query in one language and retrieving relevant documents written in a different language. A popular approach to CLIR is to translate the query into the language of the documents being retrieved. One of the simplest and most effective methods for query translation is dictionary lookup based on a bilingual dictionary. Direct translation using a bilingual dictionary is prone to three main problems: (1) knowing how a term expressed in one language might be written in another; (2) deciding which of the possible translations should be retained; and (3) deciding how to properly weight the importance of translation alternatives when more than one is retained. We evaluated the effectiveness of a Malay-English CLIR system using the bilingual dictionary approach and present the evaluation results for dictionary-based CLIR. A document collection containing newspaper articles and a related set of 35 search queries were used in this test. First, monolingual baseline queries were created manually in Malay and English. Second, the Malay queries were automatically translated into English, and vice versa. Two basic translation approaches using a bilingual dictionary were applied to each query: selecting the first translation listed in the dictionary, and selecting all translations listed in the dictionary. Then, an alternative weighting scheme was applied to the second approach (selecting all translations) to enhance retrieval performance. These three experiments were evaluated using Mean Average Precision (MAP) and average recall-precision graphs. The results were compared with monolingual IR on the Malay and English document collections, respectively.
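The three query translation strategies named in the abstract can be illustrated with a minimal sketch. The dictionary entries, the function names, the pass-through handling of out-of-vocabulary terms, and in particular the uniform 1/n weighting are all assumptions made for illustration; the abstract does not specify the weighting scheme actually used in the study.

```python
from collections import defaultdict

# Toy Malay-to-English dictionary (illustrative entries, not from the study).
TOY_DICT = {
    "harga": ["price", "cost"],
    "minyak": ["oil"],
    "naik": ["rise", "increase", "climb"],
}


def translate_first(query, dictionary):
    """Keep only the first translation listed for each query term."""
    out = []
    for term in query.lower().split():
        translations = dictionary.get(term)
        # Assumption: terms missing from the dictionary pass through unchanged.
        out.append(translations[0] if translations else term)
    return out


def translate_all(query, dictionary):
    """Keep every translation listed for each query term, unweighted."""
    out = []
    for term in query.lower().split():
        out.extend(dictionary.get(term) or [term])
    return out


def translate_all_weighted(query, dictionary):
    """Keep every translation, splitting a weight of 1.0 per source term.

    Assumption: a uniform 1/n share per alternative, so that a source term
    with many translations does not dominate the translated query.
    """
    weights = defaultdict(float)
    for term in query.lower().split():
        translations = dictionary.get(term) or [term]
        share = 1.0 / len(translations)
        for target in translations:
            weights[target] += share
    return dict(weights)
```

Under these assumptions, calling `translate_all_weighted("harga minyak naik", TOY_DICT)` gives "oil" the full weight of its source term, while "price" and "cost" share the weight of "harga" equally. The weighted query would then be passed to a retrieval engine that supports per-term weights.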
Similar Resources
Arabic/English Cross Language Information Retrieval Using a Bilingual Dictionary
With the increase in multilingual information available online and the growing number of non-native English speakers (Arabic users) browsing the Internet, it has become more important to have information retrieval systems that can carry the retrieval process across language boundaries, that is, cross-language information retrieval (CLIR) systems. The CLIR system responds to the user query in a comprehens...
English-Chinese Cross-Language Retrieval based on a Translation Package
An inexpensive COTS translation package, augmented with a downloadable bilingual dictionary, was employed for a study of English-Chinese cross-language information retrieval (CLIR) using the query translation approach. The experimental setting involved the 170 MB Chinese collections and 54 queries of TREC and their relevance judgment, and our PIRCS bi-lingual retrieval system. With some standar...
Translation Term Weighting and Combining Translation Resources in Cross-Language Retrieval
In TREC-10 the Berkeley group participated only in the English-Arabic cross-language retrieval (CLIR) track. One Arabic monolingual run and four English-Arabic cross-language runs were submitted. Our approach to the cross-language retrieval was to translate the English topics into Arabic using online English-Arabic bilingual dictionaries and machine translation software. The five official runs a...
NTCIR-5 Chinese, English, Korean Cross Language Retrieval Experiments using PIRCS
In NTCIR-5 our focus is to see if web-assisted query expansion is useful, and to test an English-Korean bilingual dictionary. We participated in Chinese, Japanese, Korean and English monolingual retrieval, also using web expansion for Chinese and English. We also performed Chinese-English, English-Chinese, English-Korean bilingual, and Chinese-Korean pivot bilingual CLIR. The query translation ap...
Phrasal Translation for English-Chinese Cross Language Information Retrieval
This paper introduces a simple and effective nonoverlapping unigram and bigram segmentation method for both monolingual Chinese and English-Chinese cross language retrieval. It also describes English-Chinese cross language retrieval experiments involving 54 topics and some 164,000 documents. The translation of English queries to Chinese is done using a Chinese-English dictionary of about 120,00...